Computer Science on the Move: Inferring Migration Regularities from the Web via Compressed Label Propagation
Abstract
Many collective human activities have been shown to exhibit universal patterns. However, the possibility of regularities underlying researcher migration in computer science (CS) has barely been explored at global scale. To a large extent, this is because official and commercial records are restricted, incompatible between countries, and, above all, not registered across researchers. We overcome these limitations by building our own transnational, large-scale dataset inferred from publicly available information on the Web. Essentially, we use Label Propagation (LP) to infer missing geo-tags of author-paper pairs retrieved from online bibliographies. On this dataset, we then find statistical regularities that explain how researchers in CS move from one place to another. Although vanilla LP is simple and has been remarkably successful, its run time can suffer from unexploited symmetries of the underlying graph. Consequently, we introduce compressed LP (CLP), which exploits these symmetries to reduce the dimensions of the matrix inverted by LP to obtain optimal labeling scores. We prove that CLP reaches the same labeling scores as LP, while often being significantly faster and using less memory.
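The abstract does not say which LP variant is used or how CLP detects symmetries, so the following is only a minimal sketch under stated assumptions. It assumes the common closed form F = (I - alpha*S)^(-1) Y with S the symmetrically normalized adjacency matrix, and it assumes the symmetric node groups (`blocks`) are already known and form an equitable partition on which the initial labels are constant. The function names, the toy graph, and `alpha = 0.9` are hypothetical and chosen for illustration. Under these assumptions, solving the small quotient system gives the same scores as solving the full one.

```python
import numpy as np

def normalize(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} (assumes no isolated nodes)."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def label_propagation(S, Y, alpha=0.9):
    """Closed-form LP scores: solve (I - alpha*S) F = Y on all n nodes."""
    n = S.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, Y)

def compressed_label_propagation(S, Y, blocks, alpha=0.9):
    """Solve the same system on one representative per block (k x k instead of n x n).

    Assumption: `blocks` is an equitable partition of the graph and Y is
    constant within each block, so that S @ B == B @ Q holds exactly.
    """
    n, k = S.shape[0], len(blocks)
    B = np.zeros((n, k))                    # block indicator matrix
    for j, members in enumerate(blocks):
        B[members, j] = 1.0
    sizes = B.sum(axis=0)
    Q = (B.T @ S @ B) / sizes[:, None]      # quotient (compressed) matrix
    Y_c = (B.T @ Y) / sizes[:, None]        # one label row per block
    F_c = np.linalg.solve(np.eye(k) - alpha * Q, Y_c)
    return B @ F_c                          # expand back to all nodes

# Toy graph: hub 0 with leaves 1-3, hub 4 with leaves 5-6, and an edge 0-4.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5), (4, 6)]
W = np.zeros((7, 7))
for u, v in edges:
    W[u, v] = W[v, u] = 1.0

# Only the two hubs carry an initial label (two classes).
Y = np.zeros((7, 2))
Y[0, 0] = 1.0
Y[4, 1] = 1.0

blocks = [[0], [1, 2, 3], [4], [5, 6]]      # interchangeable leaves share a block

S = normalize(W)
F_full = label_propagation(S, Y)
F_comp = compressed_label_propagation(S, Y, blocks)

print("max abs difference:", np.abs(F_full - F_comp).max())
print("predicted classes:", F_full.argmax(axis=1))
```

In this toy case the compressed system is 4 x 4 rather than 7 x 7. The saving grows with the number of interchangeable nodes, which is the effect the abstract attributes to CLP.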
Similar resources
Prediction of user's trustworthiness in web-based social networks via text mining
In social networks, users need a proper estimate of how far to trust others in order to establish reliable relationships. Some trust evaluation mechanisms have been proposed that use direct ratings to calculate or propagate trust values. However, in some web-based social networks where users only have binary relationships, there is no direct rating available. Therefore, a new method is required t...
Load Balancing Approaches for Web Servers: A Survey of Recent Trends
Much work has been done on load balancing of web servers in grid environments. Grid environments are popular because they allow access to distributed resources located at remote sites. For effective utilization, load must be balanced among all resources. The importance of load balancing is discussed by comparing the system without load balancing and with loa...
Community Detection using a New Node Scoring and Synchronous Label Updating of Boundary Nodes in Social Networks
Community structure is vital for discovering the important structures and potential properties of complex networks. In recent years, improving the quality of local community detection approaches has become a hot topic in the study of complex networks, owing to their linear time complexity and applicability to large-scale networks. However, there are many shortcomings in these methods such as in...
Investigating Dynamic Writing Assessment in a Web 2.0 Asynchronous Collaborative Computer-Mediated Context
This study aims at investigating the effect of dynamic assessment (DA) on L2 writing achievement if applied via blogging as a Web 2.0 tool, as well as examining which pattern of interaction is more conducive to learning in such an environment. The results of the study indicate that using weblogs to provide mediation contributes to the enhancement of the overall writing performance, vocabulary a...
GeoDBLP: Geo-Tagging DBLP for Mining the Sociology of Computer Science
Many collective human activities have been shown to exhibit universal patterns. However, the possibility of universal patterns across timing events of researcher migration has barely been explored at global scale. Here, we show that timing events of migration within different countries exhibit remarkable similarities. Specifically, we look at the distribution governing the data of researcher mi...